A comprehensive dataset of photonic features on spectral converters for energy harvesting

Building-integrated photovoltaics is a promising strategy for solar technology, in which luminescent solar concentrators (LSCs) stand out. Challenges include the development of materials for sunlight harvesting and conversion, which is an iterative optimization process with several steps: synthesis, processing, and structural and optical characterization, all before the energy generation figures of merit, which require prototype fabrication, can be considered. Thus, simulation models provide a cost-effective and time-efficient alternative to experimental implementations, enabling researchers to gain insights for informed decisions. We conducted a literature review on LSCs over the past 47 years from the Web of Science™ Core Collection, including published research conducted by our research group, to gather the optical features and identify the material classes that contribute to the performance. The dataset can be expanded systematically, offering a resource for decision-making tools for device design without extensive experimental measurements.


Background & Summary
The luminescent solar concentrator (LSC) concept (Fig. 1a) dates from the late 1970s [1,2], but major advances occurred over the last twenty years. Nowadays, LSCs are seen as an urban architecture strategy to integrate solar-harvesting devices into buildings (Fig. 1a,b) [3]. This was greatly fostered by the introduction of the Zero-Energy Building (ZEB) concept and related United Nations and European Union directives [4–6]. The implementation of ZEBs implies an optimized use of renewable energy sources, which draws attention to solutions that can readily contribute to the energy efficiency of buildings through existing infrastructure. Thus, LSCs gained renewed importance over the last decade, with real-life demonstrators being developed and implemented (e.g. highway sound barriers [7–10] and agrivoltaic applications [11,12]) and companies being founded (e.g. Glass to Power [13], UbiQD [14] and ClearVue PV [15]). A further step in LSC development was the recently reported addition of sensing abilities, so that LSCs behave as sunlight-powered optical temperature sensors [16,17]. This would make it possible to optimize heating/cooling systems without additional energy-consuming sensors or systems, enabling substantial long-term benefits for society concerning energy consumption habits. Moreover, materials science has evolved considerably over the last years in terms of achieving optically active materials with high absorption and conversion ability and small overlap between the absorption and emission spectra to prevent re-absorption losses (e.g. large Stokes shift [18,19], defined for organic molecules), which enabled the fabrication of large-area devices [18,20–29].
An LSC consists of planar waveguides that are either doped or coated with emissive materials. These materials absorb sunlight and re-emit it at distinct wavelengths that match the operating spectral region of the photovoltaic (PV) cells. The emitted light is then guided through total internal reflection towards photovoltaic cells coupled to the edges of the waveguides, where it is converted into electricity.
The optical conversion efficiency (η_opt) is widely recognized as the primary figure of merit to evaluate the performance of LSCs. It quantifies the ratio of the generated output optical power (P_out) to the incident optical power (P_in), providing a measure of how effectively the materials convert incoming light into usable optical signal [19]. This figure of merit serves as a benchmark for evaluating the efficacy of various approaches and optimizing LSC configurations to enhance overall efficiency.
Another parameter commonly used to quantify performance in terms of light harvesting and energy conversion is the power conversion efficiency (PCE). The PCE measures the ratio of the generated electrical power (P_out^el) to P_in, taking into account the specific characteristics of the coupled photovoltaic cell. The PCE provides a more comprehensive assessment of the LSC's performance by considering the electrical power and the photovoltaic cell's efficiency.
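The two figures of merit described above are simple power ratios, which the following minimal Python sketch illustrates. The function names and the numerical values are hypothetical and chosen only for illustration; they are not taken from the dataset.

```python
def optical_conversion_efficiency(p_out: float, p_in: float) -> float:
    """Ratio of output optical power to incident optical power (same units)."""
    if p_in <= 0:
        raise ValueError("incident optical power must be positive")
    return p_out / p_in


def power_conversion_efficiency(p_out_el: float, p_in: float) -> float:
    """Ratio of generated electrical power to incident optical power (same units)."""
    if p_in <= 0:
        raise ValueError("incident optical power must be positive")
    return p_out_el / p_in


# Made-up example values in watts (assumption, not measured data).
eta_opt = optical_conversion_efficiency(p_out=0.012, p_in=0.400)
pce = power_conversion_efficiency(p_out_el=0.002, p_in=0.400)
print(f"eta_opt = {eta_opt:.2%}, PCE = {pce:.2%}")
```

Both quantities are dimensionless; reporting them as percentages or as fractions is a convention that should be stated alongside the values.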
Despite enormous efforts, the improvement in these figures of merit has been limited. Unless the intrinsic limitations of the materials can be solved, it seems unrealistic to use them for solar energy conversion and for them to play a part in climate change-related actions with substantial long-term benefits for society. These limitations are: a low absorption coefficient, which translates into weak radiation-harvesting capability; large self-absorption, quantified by the spectral overlap between the emission and absorption spectra; and poor conversion efficiency, quantified by a low emission quantum yield (η_yield). In particular, self-absorption, quantified by the overlap integral OI [30] between the absorption and emission spectra (in some cases presented as the modified overlap integral OI* [31,32] if normalized to the emission spectrum), has been pointed out as one of the most critical aspects for device performance [3,19,30–34], although its quantification is available in very few works [30–32]. Moreover, bearing in mind the final goal of large-scale implementation in real applications, transparency and visible light transmittance are also key factors when replacing windows with such devices. Thus, a balance between the visual comfort of the building occupants and electrical output should be achieved.
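As an illustration of how a spectral overlap can be quantified, the sketch below integrates the product of sampled absorption and emission spectra with the trapezoidal rule, and also returns the value normalized to the emission area. This is a sketch under stated assumptions: the exact definitions of OI and OI* used in the cited works may differ (for example in how the spectra are normalized), and the function names and spectra are hypothetical.

```python
def trapezoid(y, x):
    """Trapezoidal integration of samples y over the grid x."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2 for i in range(len(x) - 1))


def overlap_integrals(wavelengths, absorption, emission):
    """Return (overlap, overlap normalized to the emission area).

    Assumption: overlap = integral of absorption * emission over wavelength,
    with both spectra sampled on the same wavelength grid.
    """
    product = [a * e for a, e in zip(absorption, emission)]
    overlap = trapezoid(product, wavelengths)
    emission_area = trapezoid(emission, wavelengths)
    normalized = overlap / emission_area if emission_area > 0 else float("nan")
    return overlap, normalized


# Made-up spectra on a 50 nm grid (illustrative only).
wavelengths = [400, 450, 500, 550, 600]
absorption = [1.0, 0.5, 0.1, 0.0, 0.0]
emission = [0.0, 0.0, 0.2, 1.0, 0.2]
oi, oi_normalized = overlap_integrals(wavelengths, absorption, emission)
print(oi, oi_normalized)
```

A small value indicates that absorption and emission barely overlap, which is the situation sought to limit re-absorption losses.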
In 1988, John Maddox wrote, "One of the continuing scandals of physical science is that it remains, in general, impossible to predict the structure of even the simplest crystalline solids from a knowledge of their chemical composition" [35]. Despite progress since then, predicting the crystal structure based solely on the composition remains challenging and entails high computational costs. An even more glaring example is the a priori prediction of materials compositions for a given set of target applications from the massive amount of produced and published experimental data, as the rationalization of materials is exceptionally difficult.
Although several reviews on LSCs have been published over the years [3,19,33,36–47], mostly concerning the type of luminescent materials in use and LSC configurations and applications, this dataset intends to go further and be a starting point for a large compilation of relevant features of the optically active materials used to fabricate LSCs, which can be helpful for researchers working in the field. This dataset also has the potential to promote much-needed standardization in the reporting of figures of merit and characterization procedures for LSC devices, which is a concern [48,49]. By establishing consistent reporting practices, researchers and industry professionals can ensure comparability, reproducibility, and effective collaboration in the field.

Methods
The dataset was collated from the community of researchers and research groups working on the development of LSCs, and all data sources are cited [1,11,16–18,20–30,32]. The first paper reporting the LSC concept dates from 1976 [1,2], setting the starting point for the literature review behind the dataset. This literature review, covering the past 47 years in the field of spectral energy conversion, was made using the CitNetExplorer and VOSviewer tools. The sample data consist of the information from 1474 published articles, letters, reviews, and books from the Web of Science™ Core Collection, containing the following citation indexes: SCI-EXPANDED, SSCI, A&HCI, CPCI-S, CPCI-SSH, ESCI, CCR-EXPANDED, and IC. The search terms, applied to all fields, were 'luminescent solar concentrator' or 'fluorescent collector' or 'greenhouse collector', in the period from 1976 to 2023, accessed on November 16, 2023. This approach has been successfully used in the field of optical sensing [232,233]. For each article, information was extracted such as the authors' names, affiliation and funding entity, the document title, keywords, abstract, and reference list, the publication citations and date, and the journal information, allowing these fields to be analyzed across a multitude of parameters.
We explored these fields by creating a map based on text categorical data (Fig. 2). The abstracts, keywords, and titles were scanned to verify whether a term is present or not (binary counting) and whether it has a link with other terms (both appearing in the same document). If a term appears in a document, it is counted as one occurrence, and if two terms appear together in the same document (co-occurrence), a link is created between them. The number of occurrences of a term is represented by the relative size of its circle. The categories, features and terms included in the dataset were chosen from the most recurrent terms, as recurrence is directly related to relevance in the field. The proximity of terms in the map is representative of how closely related they are, whether or not they co-occur. Nevertheless, in some cases, a term linked with many other terms that are not related to each other can appear at a longer distance, being placed in the middle of all its connections. Terms that were synonyms, that is, that had different designations among the published papers, were aggregated.
Based on these connections and the terms found in the search, it is possible to define two main clusters containing terms that are representative of different fields of study: one related to LSC devices and the other related to photoluminescence spectroscopy (Fig. 2). The most relevant terms, which are directly connected with the 'luminescent solar concentrator' term, are highlighted in Fig. 2 and are the ones addressed in the dataset reported here. The optical features and the materials classes are the links between the clusters, as indexing terms such as emission, absorption, quantum yield, lanthanide, carbon dot, dimension, or film are shared [234,235].
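The binary counting and co-occurrence linking described above can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm implemented in VOSviewer: term detection is reduced to case-insensitive substring matching, and the function name, the `min_occurrences` parameter and the sample texts are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_map(documents, terms, min_occurrences=1):
    """Count term occurrences (once per document) and co-occurrence links."""
    occurrences = Counter()
    links = Counter()
    for text in documents:
        lowered = text.lower()
        # Binary counting: a term counts once per document, however often it appears.
        present = sorted({t for t in terms if t.lower() in lowered})
        occurrences.update(present)
        # One link per pair of terms found in the same document.
        links.update(combinations(present, 2))
    kept = {t for t, n in occurrences.items() if n >= min_occurrences}
    occurrences = Counter({t: n for t, n in occurrences.items() if t in kept})
    links = Counter({p: n for p, n in links.items() if p[0] in kept and p[1] in kept})
    return occurrences, links


# Made-up abstracts (illustrative only).
documents = [
    "Luminescent solar concentrator based on carbon dot emission",
    "Quantum yield and emission of lanthanide complexes",
    "A luminescent solar concentrator with high quantum yield; quantum yield matters",
]
terms = ["luminescent solar concentrator", "emission", "quantum yield", "lanthanide"]
occurrences, links = cooccurrence_map(documents, terms, min_occurrences=2)
print(occurrences)
print(links)
```

In a map built from such counts, the occurrence number would set the circle size of a term and the link counts would drive the layout.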
The dataset is composed of a description of the materials used to fabricate the LSC device, covering the type and concentration of the optically active centres, the host material, and the processing methods. Only downshifting examples were considered, as they are the vast majority of reported cases, although LSCs based on other energy conversion mechanisms such as upconversion [236,237] or downconversion [238] are already available. Numerical data (when available) were also manually extracted from each of the published papers to compose the table dataset. The numerical values considered for the optical characterization parameters include: i) wavelength of peak absorption or excitation (A_p), ii) minimum wavelength of the absorption or excitation spectral band (A_min), iii) maximum wavelength of the absorption or excitation spectral band (A_max), iv) wavelength of peak emission (E_p), v) minimum wavelength of the emission spectral band (E_min), vi) maximum wavelength of the emission spectral band (E_max), and vii) emission quantum yield (η_yield) [239]. The general optical features are based on photoluminescence data and absorption spectra. Figure 3 illustrates the excitation and emission spectra of one reported LSC based on lanthanide ions [110], in which the relevant parameters were assigned as follows:
(i) A_p: wavelength at which the intensity reaches a peak value in the excitation (or absorption) spectrum, measured in nanometres (nm).
(ii) E_p: wavelength at which the emission spectrum has a maximum intensity value, measured in nm.
(iii) A_min: low-wavelength value of the excitation (or absorption) spectrum, measured in nm. In most cases, the value of 300 nm is considered as a threshold because below this the solar irradiance is very low (∼10⁻⁴ % of the total solar irradiance on Earth).
(iv) A_max: high-wavelength value of the excitation (or absorption) spectrum, where the intensity exhibits significant deviation from the noise level (>5%), measured in nm.
(v) E_min: low-wavelength value of the emission spectrum, where the intensity exhibits significant deviation from the noise level (>5%), measured in nm.
(vi) E_max: high-wavelength value of the emission spectrum, where the intensity exhibits significant deviation from the noise level (>5%), measured in nm.
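The parameter assignment above can be sketched for a digitized spectrum as follows. The sketch assumes that the ">5%" criterion is applied as a fraction of the peak intensity over a zero baseline, which is one possible reading of "significant deviation from the noise level"; the function name, its arguments and the sample spectrum are hypothetical.

```python
def band_features(wavelengths, intensities, threshold=0.05, lower_limit=None):
    """Return (peak wavelength, band minimum, band maximum) in nm.

    threshold: fraction of the peak intensity above which a point is
    considered part of the band (assumed reading of the 5% criterion).
    lower_limit: optional floor for the band minimum, e.g. 300 nm.
    """
    peak_value = max(intensities)
    peak_wavelength = wavelengths[intensities.index(peak_value)]
    cutoff = threshold * peak_value
    in_band = [w for w, i in zip(wavelengths, intensities) if i > cutoff]
    band_min, band_max = min(in_band), max(in_band)
    if lower_limit is not None:
        band_min = max(band_min, lower_limit)
    return peak_wavelength, band_min, band_max


# Made-up excitation spectrum (illustrative only).
wavelengths = [280, 300, 320, 340, 360, 380, 400, 420]
intensities = [0.30, 0.40, 0.60, 0.90, 1.00, 0.50, 0.04, 0.01]
a_p, a_min, a_max = band_features(wavelengths, intensities, lower_limit=300)
print(a_p, a_min, a_max)
```

The same function applies to an emission spectrum to obtain E_p, E_min and E_max, in which case no lower limit is needed.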
Hence, the compiled dataset is highly representative of the field, capturing a comprehensive range of optical properties and characteristics of LSCs and related materials.
The dataset also includes the so-called performance features, η_opt and PCE, which depend intrinsically on the dimensions of the LSC device (Fig. 1a,b); this information is therefore also provided in the dataset. By definition, η_opt is the ratio between the output optical power and the incident one:

$$\eta_{opt} = \frac{P_{out}}{P_{in}} \qquad (1)$$

Experimental optical measurements of P_out and P_in are performed using integrating spheres or power meters to calculate η_opt using Eq. 1 (from this point onwards referred to as the definition equation). These parameters can also be estimated when the LSCs are coupled to a photovoltaic cell. In this scenario, the literature provides various models (expressions) that correlate the electrical parameters measured in the photovoltaic cell with the optical power, with different levels of accuracy. Among these, Eqs. 2 and 3 are frequently employed to quantify η_opt. While Eq. 2 (higher accuracy equation) incorporates the efficiency of the PV cell to correct for the spectral response, Eq. 3 (lower accuracy equation) is a rougher approximation. Equation 2 is defined as follows [50]:

$$\eta_{opt} = \frac{I_{SC}^{L}\, V_{0}^{L}}{I_{SC}\, V_{0}} \times \frac{A_{e}}{A_{s}} \times \frac{\eta_{solar}}{\eta_{PV}} \qquad (2)$$

where I_SC^L and V_0^L represent the short-circuit current and the open-circuit voltage of the photovoltaic cell coupled to the LSC, respectively (I_SC and V_0 are the corresponding values of the photovoltaic cell exposed directly to solar radiation), η_solar is the efficiency of the photovoltaic cell relative to the total solar spectrum, η_PV is the efficiency of the photovoltaic cell at the LSC emission wavelengths, A_e is the LSC edge area, and A_s is the top surface area of the LSC [50]. An alternative definition, Eq. 3, is given in [170]. There is also a more theoretical approach (theoretical equation), which considers that η_opt can be described by weighting all the main optical losses found in the LSC (most of which can be assessed experimentally), given by the product of the terms in Eq. 4 [240]:

$$\eta_{opt} = (1 - R)\, \eta_{abs}\, \eta_{SA}\, \eta_{yield}\, \eta_{Stokes}\, \eta_{trap}\, \eta_{mat} \qquad (4)$$

in which R is the Fresnel reflection coefficient for perpendicular incidence, η_abs is the ratio of photons absorbed by the emitting layer to the number of photons falling on it, η_SA is the self-absorption efficiency [241], η_Stokes is the Stokes efficiency, η_trap is the trapping efficiency and η_mat takes into account the transport losses due to matrix absorption and scattering. This suggests that the different equations yield comparable results in terms of optical conversion efficiency, although Eq. 3 is the most commonly used. The PCE figure of merit is obtained from experimental data using Eq. 5:

$$PCE = \frac{P_{out}^{el}}{P_{in}} = \frac{I_{SC}^{L}\, V_{0}^{L}\, FF}{P_{in}} \qquad (5)$$

where FF is the fill factor of the photovoltaic cell. The PCE figure of merit correlates the output electrical power (which depends directly on the PV cell in use) with the incident optical power. We note that the number of entries in the dataset is somewhat limited because, although the number of publications has been increasing over the last 15 years (total publications ∼1500), a significant share (∼80%) of published works on luminescent solar concentrators lack performance quantification in terms of either η_opt or PCE. This results in ∼300 published works with LSC performance quantification, matching the number of entries in the dataset.

Fig. 3 Optical features description. Excitation spectrum monitored at 612 nm and emission spectrum excited at 370 nm for a selected Eu³⁺-based organic-inorganic hybrid [110]. The shadowed area represents the Air Mass 1.5 Global solar spectrum (AM1.5G, the spectrum generally used in terrestrial solar cell research, right y axis).
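The PV-based estimate, the loss-product description and the PCE can be sketched numerically as below. The sketch assumes that the PV-based estimate is the ratio of the I·V products of the cell coupled to the LSC and of the cell under direct illumination, scaled by the edge-to-surface area ratio and by η_solar/η_PV, and that the PCE is the product of short-circuit current, open-circuit voltage and fill factor divided by the incident power. Function names, argument names and all numerical values are hypothetical.

```python
from math import prod


def eta_opt_from_pv(i_sc_lsc, v0_lsc, i_sc_ref, v0_ref,
                    a_edge, a_surface, eta_solar, eta_pv):
    """PV-based estimate of the optical conversion efficiency (assumed form)."""
    electrical_ratio = (i_sc_lsc * v0_lsc) / (i_sc_ref * v0_ref)
    return electrical_ratio * (a_edge / a_surface) * (eta_solar / eta_pv)


def eta_opt_from_losses(reflection, eta_abs, eta_sa, eta_yield,
                        eta_stokes, eta_trap, eta_mat):
    """Product of the individual loss terms, with (1 - R) for surface reflection."""
    return (1 - reflection) * prod(
        [eta_abs, eta_sa, eta_yield, eta_stokes, eta_trap, eta_mat]
    )


def pce_from_pv(i_sc_lsc, v0_lsc, fill_factor, p_in):
    """Electrical output power (I_SC * V_0 * FF) over incident optical power."""
    return i_sc_lsc * v0_lsc * fill_factor / p_in


# Made-up values in SI units (illustrative only).
print(eta_opt_from_pv(1e-3, 0.5, 10e-3, 0.5, 1.0, 10.0, 0.2, 0.2))
print(eta_opt_from_losses(0.04, 0.3, 0.9, 0.8, 0.75, 0.75, 0.95))
print(pce_from_pv(2e-3, 0.5, 0.7, 0.1))
```

Keeping the three estimates as separate functions makes it easy to record which expression produced a reported value, which matters when comparing entries from different sources.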

Data Records
The complete dataset is available at figshare [242]. The data are contained in an Excel file (.xlsx), composed of 27 columns and 305 entries, which provides the data and the details of the dataset. The key to the columns and units is presented in the following tables, divided into two types: i) materials and manufacturing processing categories (Table 1) and ii) numerical values considered for the optical features and electrical characterization parameters (Table 2). Table 3 describes the columns that were included in the dataset to facilitate identification and tracking of the reported LSC, such as designation, publication year and DOI of the source published work.
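A table of this kind can be read with the Python standard library once the spreadsheet has been exported to CSV. The sketch below is an assumption-laden illustration: the column names (`designation`, `year`, `A_p`, `E_p`, `eta_yield`, `eta_opt`, `PCE`) and the sample rows are hypothetical and do not reproduce the actual headers or entries of the file.

```python
import csv
import io

# Hypothetical names for the numerical columns (assumption).
NUMERIC_COLUMNS = ("A_p", "E_p", "eta_yield", "eta_opt", "PCE")


def to_float(cell):
    """Convert a cell to float, returning None for empty or non-numeric cells."""
    cell = (cell or "").strip()
    try:
        return float(cell)
    except ValueError:
        return None


def load_records(csv_text):
    """Parse CSV text into a list of dicts with numeric columns converted."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for name in NUMERIC_COLUMNS:
            if name in row:
                row[name] = to_float(row[name])
        records.append(row)
    return records


def with_performance(records):
    """Keep only entries that report eta_opt or PCE."""
    return [r for r in records
            if r.get("eta_opt") is not None or r.get("PCE") is not None]


# Made-up rows (illustrative only).
SAMPLE = """designation,year,A_p,E_p,eta_yield,eta_opt,PCE
dye-polymer,2015,450,600,0.9,3.2,
hybrid-glass,2019,370,612,0.6,,
"""
records = load_records(SAMPLE)
print(len(records), len(with_performance(records)))
```

Treating missing values as `None` rather than zero keeps entries without a reported figure of merit distinguishable from entries reporting a low one.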

Technical Validation
To ensure data integrity and quality, only data extracted from published works in SCI-indexed journals were considered. The data related to spectroscopic features (emission and absorption/excitation) were either taken directly from the text, when the figures were fully described, or extracted from the presented graphs (spectra), which may cause some misreading of values, with an estimated deviation of ±10 nm. For numerical data (OI, OI*, η_yield, η_opt and PCE), the values were extracted from the main text as reported by the authors. Concerning the experimental data, the spectroscopic data carry the deviation associated with the measuring equipment, which is typically around 2 nm. For η_yield, the values are typically within a 10% error range, as usually stated by the manufacturers of the integrating sphere apparatus, probably related to detector sensitivity limitations and software calculations.
The error associated with the calculated values of η_opt and PCE was estimated using the error propagation method, which generally yields relative errors Δη_opt/η_opt and ΔPCE/PCE below 5%. The η_opt associated error is given by:
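For a quantity computed as a product or quotient of measured factors, first-order error propagation reduces to combining the relative errors of the factors. The sketch below shows two common conventions, a linear (worst-case) sum and a sum in quadrature. Which convention the original expression follows is not stated here, so both are given as assumptions; the function names and the numerical values are hypothetical.

```python
from math import sqrt


def relative_error_linear(values, errors):
    """Worst-case relative error of a product/quotient: sum of relative errors."""
    return sum(abs(e) / abs(v) for v, e in zip(values, errors))


def relative_error_quadrature(values, errors):
    """Relative error of a product/quotient for independent errors (quadrature)."""
    return sqrt(sum((e / v) ** 2 for v, e in zip(values, errors)))


# Made-up factors and absolute errors (illustrative only):
# two currents (A), two voltages (V) and two areas (m^2).
values = [1.0e-3, 10.0e-3, 0.50, 0.55, 2.0e-4, 2.5e-3]
errors = [1.0e-6, 1.0e-5, 0.001, 0.001, 1.0e-6, 1.0e-5]
print(relative_error_linear(values, errors))
print(relative_error_quadrature(values, errors))
```

The linear sum is always at least as large as the quadrature result, so it serves as a conservative bound when the independence of the measurement errors is uncertain.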

Usage Notes
The dataset presented in this work is intended as a resource for researchers and engineers working in the field of optical materials for down-shifting conversion in building-integrated photovoltaics. This comprehensive dataset is suitable for data-driven analyses and models that may predict the efficiency of new LSCs without extensive experimental measurements. It can be continuously expanded and augmented in the future, offering the opportunity for data mining, and may serve as training data for machine learning (ML) models.

Fig. 1 Luminescent solar concentrator concept. Scheme of (a) planar and (b) fiber-based LSC with dimensions (l: length, w: width, t: thickness, d_in: inner diameter, d_out: outer diameter), the latter in the hollow-core configuration. In this case, the doped layer is in the core (d_in). The four edges of the planar LSCs may be coupled to PV cells or mirrors (or reflective tapes).

Fig. 2 Network visualization of term occurrences extracted from abstracts and titles in 1322 publications from the Web of Science™ principal collection in the period 1976-2023, using 'luminescent solar concentrator' as the search keyword. A minimum threshold of 10 term co-occurrences was used. The diameter of the circles is directly proportional to the number of occurrences of an indexing term, and the distance between two indexing terms on the map reflects the relation between them (the closer two indexing terms are, the more related they are). The highlighted lines represent the direct connections with the 'luminescent solar concentrator' term.

The η_opt error estimate assumes a standard multimeter (such as the 2400 SourceMeter SMU Instruments, Keithley) for the electrical measurements and Δη_solar = Δη_PV = 0.01; ΔA_e and ΔA_s account for the error in measuring the LSC dimensions, which can be done using a measuring tape/ruler or a caliper with a 5 × 10⁻⁴ or 5 × 10⁻⁵ m error, respectively. The error associated with the PCE is obtained by the same error propagation method.

Table 1. Parameters included in the dataset related to the materials and the manufacturing processing.

Table 2. Parameters included in the dataset related to numerical values considered for the optical and electrical performance quantification.
a Absolute values.

Table 3. Other information provided in the dataset related to the listed devices.
designation: describes the LSC as optical centre - host
year: the publication year of the source paper from which the data is obtained
DOI: the source paper DOI, allowing for easy identification and citation